28 research outputs found

    Reconstruction of gene regulatory networks from postgenomic data

    Institute for Adaptive and Neural Computation
    An important problem in systems biology is the inference of biochemical pathways and regulatory networks from postgenomic data. The recent substantial increase in the availability of such data has stimulated interest in inferring the networks and pathways from the data themselves. The main interests of this thesis are the application, evaluation and improvement of machine learning methods for the reverse engineering of biochemical pathways and networks. The thesis starts with the application of an established method to newly available gene expression data related to the interferon pathway of the human immune system in order to identify active subpathways under different experimental conditions. It continues with a comparative evaluation of various machine learning methods (Relevance networks, Graphical Gaussian Models, Bayesian networks) using observational and interventional data from cytometry experiments as well as simulated data from a gold-standard network. The thesis also extends and improves existing methods to include biological prior knowledge under the Bayesian approach in order to increase the accuracy of the predicted networks, and it quantifies to what extent the reconstruction accuracy can be improved in this way.

    Decrypting strong and weak single-walled carbon nanotubes interactions with mitochondrial voltage-dependent anion channels using molecular docking and perturbation theory

    The current molecular docking study provided the Free Energy of Binding (FEB) for the interaction (nanotoxicity) between VDAC mitochondrial channels of three species (VDAC1-Mus musculus, VDAC1-Homo sapiens, VDAC2-Danio rerio) and the SWCNT-H, SWCNT-OH and SWCNT-COOH carbon nanotubes. The general results showed that the FEB values were statistically more negative (p [...] (SWCNT-VDAC1-Mus musculus) > (SWCNT-VDAC1-Homo sapiens) > (ATP-VDAC). More negative FEB values for SWCNT-COOH and SWCNT-OH were found in VDAC2-Danio rerio when compared with VDAC1-Mus musculus and VDAC1-Homo sapiens (p [...] r² > 0.97) was observed between the n-Hamada index and VDAC nanotoxicity (or FEB) for the zigzag topologies of SWCNT-COOH and SWCNT-OH. Predictive Nanoparticles-Quantitative-Structure Binding-Relationship models (nano-QSBR) for strong and weak SWCNT-VDAC docking interactions were built using Perturbation Theory, regression and classification models. Thus, 405 SWCNT-VDAC interactions were predicted using a nano-PT-QSBR classification model with high accuracy, specificity, and sensitivity (73–98%) in the training and validation series, and a maximum AUROC value of 0.978. In addition, the best regression model was obtained with Random Forest (R² of 0.833, RMSE of 0.0844), suggesting an excellent potential to predict SWCNT-VDAC channel nanotoxicity.
    Brasil. Conselho Nacional de Desenvolvimento Científico e Tecnológico; 552131/2011-3. Brasil. Conselho Nacional de Desenvolvimento Científico e Tecnológico; 454332/2014-9. Galicia. Consellería de Cultura, Educación e Ordenación Universitaria; R2014/03

    Alarm prediction system for industrial processes based on unsupervised classification (Sistema de predição de alarmes em processos industriais por classificação não-supervisionada)

    This work proposes an alarm prediction system. Its main aims are to support the establishment of a predictive industrial maintenance policy and to serve as a management decision support tool. The proposed system obtains readings from many sensors installed in the industrial plant, extracts their characteristics and evaluates the equipment's health. The diagnosis and prognosis imply a classification of the industrial plant's operational condition. Regression trees and unsupervised classification techniques are applied in this paper. A sample of measurements of 73 variables, taken by sensors installed in a hydroelectric power plant, is used to test and validate the proposed methodology. The measurements were sampled over a 15-month period.

    Comparing the reconstruction of regulatory pathways with distinct Bayesian networks inference methods

    Background: Inference of biological networks has become an important tool in Systems Biology. It is becoming clearer that the complexity of organisms is related more to the organization of their components in networks than to the individual behaviour of the components. Among the various approaches for inferring networks, Bayesian Networks are very attractive due to their probabilistic nature and their flexibility to incorporate interventions and extra sources of information. Recently, various attempts have been made to infer networks with different Bayesian Network approaches. The specific interest of this paper is to compare the performance of three different inference approaches: Bayesian Networks without any modification; Bayesian Networks modified to take into account specific interventions produced during data collection; and a probabilistic hierarchical model that allows the inclusion of extra knowledge in the inference of Bayesian Networks. The inference is performed on three different types of data: (i) synthetic data obtained from a Gaussian distribution, (ii) synthetic data simulated with Netbuilder and (iii) real data obtained in flow cytometry experiments. Results: Bayesian Networks with interventions and Bayesian Networks with inclusion of extra knowledge outperform simple Bayesian Networks on all data sets when considering the reconstruction accuracy and taking the edge directions into account. On the real data, the increase in accuracy is also observed when the edge directions are not taken into account. Conclusions: Although it comes with a small extra computational cost, the use of more refined Bayesian network models is justified. Both the inclusion of extra knowledge and the use of interventions outperformed the simple Bayesian network model on simulated and real data sets. Also, if the source of extra knowledge used in the inference is not reliable, the inferred network does not deteriorate. If the extra knowledge agrees well with the data, there is no significant difference between using Bayesian networks with interventions and Bayesian networks with the extra knowledge.

    Reconstructing Gene Regulatory Networks with Bayesian Networks by Combining Expression Data with Multiple Sources of Prior Knowledge

    There have been various attempts to reconstruct gene regulatory networks from microarray expression data. However, owing to the limited number of independent experimental conditions and the noise inherent in the measurements, the results have been rather modest so far. For this reason it seems advisable to include biological prior knowledge, related, for instance, to transcription factor binding locations in promoter regions or partially known signalling pathways from the literature. In the present paper, we consider a Bayesian approach to systematically integrate expression data with multiple sources of prior knowledge. Each source is encoded via a separate energy function, from which a prior distribution over network structures in the form of a Gibbs distribution is constructed. The hyperparameters associated with the different sources of prior knowledge, which measure the influence of the respective prior relative to the data, are sampled from the posterior distribution with MCMC. We have evaluated the proposed scheme on the yeast cell cycle and the Raf signalling pathway. Our findings quantify to what extent the inclusion of independent prior knowledge improves the network reconstruction accuracy, and the values of the hyperparameters inferred with the proposed scheme were found to be close to optimal with respect to minimizing the reconstruction error.
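
The construction of a Gibbs prior from per-source energy functions can be illustrated with a small sketch. The abstract does not give the form of the energy, so the version below assumes a simple one (the total absolute mismatch between a graph's adjacency matrix and a matrix of prior edge confidences); the function names, the two toy sources and the hyperparameter values are all hypothetical. The hyperparameters are fixed by hand here, whereas the paper samples them with MCMC.

```python
import math
from itertools import product

def energy(graph, prior):
    """Assumed energy: total absolute mismatch between the adjacency
    matrix of a graph and a matrix of prior edge confidences."""
    return sum(abs(p - g)
               for row_g, row_p in zip(graph, prior)
               for g, p in zip(row_g, row_p))

def gibbs_prior(graphs, priors, betas):
    """P(G | betas) proportional to exp(-sum_k beta_k * E_k(G)),
    normalised over the supplied candidate graphs."""
    weights = [
        math.exp(-sum(beta * energy(g, prior)
                      for beta, prior in zip(betas, priors)))
        for g in graphs
    ]
    z = sum(weights)  # partition function over the candidate set
    return [w / z for w in weights]

# Hypothetical prior confidences from two sources for a two-node network.
source_a = [[0.0, 0.9], [0.1, 0.0]]
source_b = [[0.0, 0.6], [0.6, 0.0]]

# All directed graphs on two nodes without self-loops.
graphs = [[[0, a], [b, 0]] for a, b in product([0, 1], repeat=2)]

# With all hyperparameters at zero the prior carries no information.
uniform = gibbs_prior(graphs, [source_a, source_b], betas=[0.0, 0.0])
# Larger hyperparameters give the corresponding source more influence.
informed = gibbs_prior(graphs, [source_a, source_b], betas=[2.0, 0.5])

best = graphs[informed.index(max(informed))]
print(uniform)
print(best)
```

Enumerating every graph is only feasible for toy networks; for realistic sizes the normalisation constant cannot be computed this way, which is one reason sampling schemes are used in practice.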

    Phosphoproteomics data-driven signalling network inference: Does it work?

    The advent of global phosphoproteome profiling has led to wide phosphosite coverage and therefore the opportunity to predict kinase-substrate associations from these datasets. However, the regulatory kinase is unknown for most substrates, due to biased and incomplete database annotations. In this study we compare the performance of six pairwise measures to predict kinase-substrate associations using a data-driven approach on publicly available time-resolved and perturbation mass spectrometry-based phosphoproteome data. First, we validated the performance of these measures using as a reference both a literature-based phosphosite-specific protein interaction network and a predicted kinase–substrate (KS) interactions set. The overall performance in predicting kinase-substrate associations using pairwise measures across both these reference sets was poor. To expand into the wider interactome space, we applied the approach to a network comprising pairs of substrates regulated by the same kinase (substrate-substrate associations) but found the performance to be equally poor. However, the addition of a sequence similarity filter for substrate–substrate associations led to a significant boost in performance. Our findings imply that a filter that reduces the search space, such as a sequence similarity filter, can be applied before network inference methods to reduce noise and boost the signal. We also find that the current gold standard for reference sets is not adequate for evaluation as it is limited and context-agnostic. Therefore, there is a need for additional evaluation methods that have increased coverage and take into consideration the context-specific nature of kinase–substrate associations.
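
As an illustration of what scoring associations with a pairwise measure involves, the sketch below ranks kinase-substrate pairs by the absolute Pearson correlation of their time profiles. The abstract does not name its six measures, so Pearson correlation is an assumed stand-in, and the profile values and identifiers (`K1`, `S1`, and so on) are made up for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank_pairs(kinases, substrates):
    """Score every kinase-substrate pair and sort by descending |r|."""
    scores = [
        (abs(pearson(k_profile, s_profile)), k, s)
        for k, k_profile in kinases.items()
        for s, s_profile in substrates.items()
    ]
    return sorted(scores, reverse=True)

# Hypothetical phosphorylation time courses (arbitrary units).
kinases = {"K1": [1, 2, 3, 4], "K2": [4, 3, 1, 2]}
substrates = {"S1": [2, 4, 6, 8], "S2": [1, 0, 1, 0]}

ranked = rank_pairs(kinases, substrates)
for score, kinase, substrate in ranked:
    print(f"{kinase}-{substrate}: {score:.3f}")
```

A filter of the kind described in the abstract would restrict which pairs are scored before ranking; that step is omitted here.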

    Validation and Application of the System Code TRACE for Safety Related Investigations of Innovative Nuclear Energy Systems

    The system code TRACE is the latest development of the U.S. Nuclear Regulatory Commission (US NRC). TRACE, developed for the analysis of operational conditions, transients and accidents of light water reactors (LWR), is a best-estimate code with a two-fluid, six-equation model for mass, energy, and momentum conservation, and related closure models. Since TRACE is mainly applied to LWR-specific issues, its validation for innovative nuclear systems (liquid metal cooled systems, systems operated with supercritical water, etc.) is very limited, almost nonexistent. This work provides an essential contribution to the validation of TRACE for lead and lead alloy cooled systems, as well as for systems operated with supercritical water, in a consistent and coherent way. In a first step, model discrepancies in the TRACE source code were removed. These inconsistencies caused wrong predictions of the thermophysical properties of supercritical water and lead-bismuth eutectic, and hence incorrect predictions of characteristic numbers relevant to heat transfer, such as the Reynolds or Prandtl number. In addition to the correction of the models that predict these quantities, models describing the thermophysical properties of lead and Diphyl THT (a synthetic heat transfer medium) were implemented. Several experiments and numerical benchmarks were used to validate the modified TRACE version. These experiments, mainly focused on wall-to-fluid heat transfer, revealed that not only the thermophysical properties but also the heat transfer models were affected by inconsistencies. The models for heat transfer to liquid metals were enhanced so that the code can now distinguish between pipe and bundle flow by using the appropriate correlation. Heat transfer to supercritical water did not exist in TRACE until now; completely new routines were implemented to overcome that issue. The comparison of the calculations with the experiments showed, on the one hand, the necessity of these changes and, on the other hand, the success of the newly implemented routines and functions. The predictions using the modified TRACE version were close to the experimental data. After validating the modified TRACE version, two design studies related to the Generation IV International Forum (GIF) were investigated. In the first one, the core of a lead-cooled fast reactor (LFR) was analyzed. To include the interaction between the thermal hydraulics and the neutron kinetics due to temperature and density changes, the TRACE code was coupled to the program system ERANOS2.1. The results obtained with the coupled system are in accordance with theory and helped to identify the sub-assemblies with the highest loads in terms of fuel and cladding temperature. The second design investigated was the High Performance Light Water Reactor (HPLWR). Since the design of the HPLWR is not finalized, optimization of vital parameters (power, mass flow rate, etc.) is still ongoing. Since most of the parameters affect each other, an uncertainty and sensitivity analysis was performed. The uncertainty analysis showed the upper and lower bounds of selected parameters that are important from the safety point of view (e.g., fuel and cladding temperature, moderator temperature). The sensitivity study identified the most relevant parameters and their influence on the whole system.
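
The link between property errors and the characteristic numbers named in the abstract can be made concrete with the standard definitions Re = ρuD/μ and Pr = c_p μ/k. The sketch below is not TRACE code; the property values are round-number placeholders chosen for illustration, not measured data, and the 10% viscosity error is hypothetical.

```python
def reynolds(density, velocity, diameter, viscosity):
    """Reynolds number Re = rho * u * D / mu (SI units)."""
    return density * velocity * diameter / viscosity

def prandtl(specific_heat, viscosity, conductivity):
    """Prandtl number Pr = c_p * mu / k (SI units)."""
    return specific_heat * viscosity / conductivity

# Placeholder liquid-metal-like properties (illustrative values only).
density = 10000.0       # kg/m^3
viscosity = 1.5e-3      # Pa*s
specific_heat = 145.0   # J/(kg*K)
conductivity = 13.0     # W/(m*K)
velocity = 1.0          # m/s
diameter = 0.01         # m

re_ref = reynolds(density, velocity, diameter, viscosity)
pr_ref = prandtl(specific_heat, viscosity, conductivity)

# Hypothetical property-model error: viscosity overestimated by 10%.
viscosity_bad = 1.1 * viscosity
re_bad = reynolds(density, velocity, diameter, viscosity_bad)
pr_bad = prandtl(specific_heat, viscosity_bad, conductivity)

print(f"Re: {re_ref:.0f} -> {re_bad:.0f}")
print(f"Pr: {pr_ref:.4f} -> {pr_bad:.4f}")
```

Because Re scales with 1/μ and Pr with μ, a single viscosity error shifts both numbers in opposite directions, and any heat transfer correlation evaluated from them inherits the error.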